get_best() searches within the range set by the max_movement variable. It is always 4x the resolution; in the above example, it is 16.

display_grid_of_images(grid_of_images, grid_of_names)
As seen above, the pyramid method works well for most of the images; however, in some of them, especially ones with repeated patterns, it has found local minima.
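The pyramid search can be sketched as a coarse-to-fine recursion. This is a minimal illustration, not the exact implementation: the function name, the 64-pixel base case, and the SSD difference score are assumptions for the sketch.

```python
import numpy as np

def pyramid_align(ref, mov, max_movement=16):
    """Coarse-to-fine search for the (dx, dy) that best aligns `mov` to `ref`.

    Recurse on half-resolution copies first, then refine the doubled
    coarse estimate with a small local search at the current resolution.
    """
    if max(ref.shape) <= 64:           # coarsest level: brute-force search
        base, radius = (0, 0), max_movement
    else:
        coarse = pyramid_align(ref[::2, ::2], mov[::2, ::2], max_movement)
        base = (2 * coarse[0], 2 * coarse[1])
        radius = 2                      # only refine around the estimate
    best, best_score = base, np.inf
    for dx in range(base[0] - radius, base[0] + radius + 1):
        for dy in range(base[1] - radius, base[1] + radius + 1):
            shifted = np.roll(np.roll(mov, dy, axis=0), dx, axis=1)
            score = np.sum((ref - shifted) ** 2)   # SSD difference
            if score < best_score:
                best, best_score = (dx, dy), score
    return best
```

Because every level only refines a doubled estimate within a small radius, the total work stays near-linear in the image size, but a wrong answer at a coarse level (a local minimum) propagates down, which is exactly the failure mode seen above.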
Offsets are of the format: list_of_pyramid_offsets[i] = [[best green x, best green y, best green angle], [best red x, best red y, best red angle]]
for i in range(len(list_of_files)):
name = list_of_files[i]
name = name[5:-4]
print(f"{name} offset: {list_of_pyramid_offsets[i]}")
cathedral offset: [[60, 2, 0], [61, 8, 0]]
church offset: [[4, 24, 0], [248, 64, 0]]
emir offset: [[24, 48, 0], [44, 60, 0]]
harvesters offset: [[16, 60, 0], [16, 124, 0]]
icon offset: [[16, 40, 0], [24, 88, 0]]
lady offset: [[8, 56, 0], [12, 112, 0]]
melons offset: [[12, 84, 0], [16, 180, 0]]
monastery offset: [[2, -3, 0], [60, 5, 0]]
onion_church offset: [[28, 52, 0], [36, 108, 0]]
self_portrait offset: [[28, 76, 0], [36, 176, 0]]
three_generations offset: [[16, 52, 0], [12, 112, 0]]
tobolsk offset: [[3, 3, 0], [52, 10, 0]]
train offset: [[8, 44, 0], [32, 88, 0]]
workshop offset: [[0, 52, 0], [-12, 104, 0]]
In this section I improve the alignment in two ways: matching edges instead of raw pixel values, and scoring alignments with an elementwise product instead of a difference.
So far, we have been matching the channels by minimizing the difference between the arrays of each color. However, this only works well when there is a lot of contrast in brightness and not much saturation. In images with more vibrant colors and less brightness contrast, the raw similarity between the color channels means little. This can be seen in the results for tobolsk, melons, self_portrait, emir, and cathedral (in the case of cathedral, green and red were matched well to each other but not to blue).
This is not a problem for humans. I can't speak for other humans, but if I were to do this task by hand, I would match the edges instead of the colors.
In this section I attempt to use the magnitude of the gradient to find the edges.
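A minimal sketch of what such an edge map could look like, assuming each channel is a 2-D float array (the function name here is hypothetical, not the one used in the notebook):

```python
import numpy as np

def edge_magnitude(channel):
    """Approximate edge map: the magnitude of the intensity gradient.

    np.gradient returns the partial derivatives along each axis;
    the Euclidean norm of those gives the gradient magnitude at
    every pixel, which is large along edges and ~0 in flat regions.
    """
    dy, dx = np.gradient(channel.astype(float))
    return np.sqrt(dx ** 2 + dy ** 2)
```

The key property is that the edge map depends only on local intensity changes, so two channels with very different absolute brightness still produce similar edge maps wherever the scene structure matches.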
Code below. First I test the method on monastery, one of the JPEGs that the original method didn't work well on.
monastery = read_img('data/monastery.jpg')
display_dictionary_of_images(monastery, 'monastery', horiz=True)
monastery_edges = convert_dict_to_edges(monastery)
display_dictionary_of_images(monastery_edges, 'monastery edges', horiz = True, w = 8, h = 7)
This seems to work very well for this little jpg. Next, I will test it to see if it also works on one of the large .tif images.
melons = read_img('data/melons.tif')
melons_edges = convert_dict_to_edges(melons)
display_dictionary_of_images(melons_edges, 'melons edges', horiz = True, w = 8, h = 7)
display_dictionary_of_images(scale_down(melons_edges, 32), 'melons edges 1/32 scale', horiz = True, w = 8, h = 7)
Clearly, these are very dark - even the one that wasn't deliberately scaled down was automatically rescaled to fit the screen. However, the important thing is that the edges are still present and distinguishable. Whether they are easy or hard to see with the naked eye shouldn't matter much to a machine.
This is another idea based on how humans (or at least I) align images. When we align the edges, we don't care about most of the image - we only care that the lines match, and we pay no attention to the black space at all. Thus, an elementwise product is a better representation of how a human judges the quality of an alignment.
So I wrote tensor_product_score, which takes the sum of an elementwise product of the two arrays it is comparing.
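A sketch of such a score, assuming both inputs are edge maps of the same shape (the exact implementation in the notebook may differ):

```python
import numpy as np

def tensor_product_score(a, b):
    """Score an alignment by the sum of the elementwise product.

    Bright pixels (edges) that coincide in both arrays contribute a
    large positive term; the black background contributes ~0, so the
    score ignores empty space and rewards overlapping edges.
    """
    return np.sum(a * b)
```

Note the sign flip relative to the earlier difference metric: this score is maximized rather than minimized, so the search keeps the offset with the highest score.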
This clipping uses a simple algorithm of taking the max offset for each dimension, and clipping both sides for that dimension.
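The clipping step described above can be sketched as follows. This is a hypothetical helper for illustration, assuming offsets in the [x, y, angle] format used earlier and channels that have already been shifted into alignment:

```python
import numpy as np

def clip_borders(channels, offsets):
    """Clip all channels by the maximum absolute offset per dimension.

    Taking the max |x| and max |y| over all offsets and trimming both
    sides by those amounts removes the wrapped/blank borders that the
    shifts introduced, on every channel symmetrically.
    """
    max_x = max(abs(o[0]) for o in offsets)   # columns
    max_y = max(abs(o[1]) for o in offsets)   # rows
    clipped = []
    for c in channels:
        h, w = c.shape
        clipped.append(c[max_y:h - max_y, max_x:w - max_x])
    return clipped
```

Clipping both sides by the same amount is slightly wasteful (only one side of each axis actually has a bad border per channel), but it keeps all channels the same shape with a one-line slice.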
Also, I no longer check angles: in each of the original images, the best angle was 0, so there is little point in searching over rotation.
display_grid_of_images(ec_grid_of_images, ec_grid_of_titles, w = 7, h = 6)